Section 1: Introduction


1. The Australian Bureau of Statistics (ABS) uses an enhanced version of the
X-11 Variant of the Census Method II Seasonal Adjustment Program (Shiskin et al.,
1967). The X-11 method applies moving average techniques to decompose a
time series into estimates of the trend, seasonal and irregular components.


2. A linear filter with symmetric weights (ie. a symmetric filter) has many
desirable properties for seasonal adjustment. However, symmetric filters cannot be
applied at the ends of a time series, because the future observations they require
are not yet available. The standard X-11 uses asymmetric filters to
solve this problem. Asymmetric filters are often designed with assumptions about
certain properties of the missing observations, ie. forecasts are implicitly applied. As
a result of the forecasting, revisions will occur between the first few estimates of the
seasonal factors and trend at a particular time point and the final trend estimate,
which is calculated using a symmetric moving average. The problem of the
revisions derived from the application of asymmetric filters is often called the
"end-weight" problem.


3. Revisions resulting from the applications of asymmetric filters are necessary 
to improve seasonal adjustment estimation as more data becomes available. 
However, revisions are undesirable and methods to minimise them are an ongoing 
research pursuit for the ABS. A better forecast of the missing observations can lead 
to a reduction in the revisions of the seasonal adjustment estimates. 


4. Assuming the symmetric filters used in X-11 are satisfactory, the purpose of 
this paper is to investigate the possibility of reducing the "end-weight" problem by 
using X-11 with ARIMA extensions. In this paper, we 
1. evaluate the fitting performance of a set of the most often applied ARIMA 
models; 
2. confirm that a reduction in revisions is achieved when using ARIMA 
forecasting instead of the asymmetric filters in the standard X-11; 
3. identify the conditions under which an ARIMA performs better than the 
standard X-11. 


5. To achieve these goals, a range of ABS time series (820 series) has been
used. Our research showed that:


1. In comparison with the standard X-11, on average, an ARIMA model 
achieves 6-7% reduction in revision to the first seasonally adjusted 
estimate; 


2. The average absolute percentage change in irregulars (denoted by 
STAR) is the most powerful statistical measure to predict the possible 
revision size; 


3. The ratio of the average absolute percentage change in the irregular 
component and the average absolute percentage change in the X-11 
trend (denoted by I/C) is the most powerful statistical measure to indicate 
when an ARIMA model is likely to be better than the standard X-11. The 
larger the I/C, the better an ARIMA model performs over the standard 
X-11. 


6. This report is organised as follows. Section 2 gives a short literature review of 
the "end-weight" problem. Section 3 presents a comparison of the revisions from the 
standard X-11 versus X-11 with ARIMA extension. Section 4 presents statistical 
analyses aimed at identifying how the revision is related to a set of common 
statistical measures, and which measures are most likely to predict if X-11 with 
ARIMA extension can produce less revision than the standard X-11. Our 
conclusions, recommendation and future research directions are discussed in 
Section 5. 


Section 2: Background 


7. X-11 is a seasonal adjustment package developed by the US Bureau of the
Census in the 1950's and 1960's (Shiskin et al. 1967, Ladiray and Quenneville
2001). X-11 uses moving averages to decompose a time series into estimates of
the trend, seasonal and irregular components. The trend and seasonal components
each use their own set of symmetric and asymmetric moving averages (filters) to calculate
estimates in the middle and at the ends of the data. The details are outlined in
Table 1.


Table 1: Moving averages (filters) used in the X-11 seasonal adjustment package. 


Trend | Seasonal
The symmetric moving averages used to estimate the trend component are calculated using the Henderson moving average (Henderson, 1916) and the length of the filter is determined by a relative variation measure of the irregular and trend. | The symmetric moving averages used for the seasonal are based on convoluted simple moving averages eg. 3x5, 3 term of a 5 term simple moving average (Dagum, 1996). The "optimal" length of the seasonal filter is determined by a relative variation measure of the irregular and seasonal.
The asymmetric moving averages used for the ends of the trend are calculated using a simple linear model developed by Musgrave (Doherty, 1992). | The asymmetric moving averages for the seasonal moving averages are based on an unknown and undocumented methodology (see Appendix B for more details).


8. There are two broad categories of methods to reduce the revision
("end-weight") problem: (1) improving the asymmetric filters (Gray and Thomson
1996a 1996b 1996c, McLaren and Steel 1998, Quenneville and Ladiray 2000), and
(2) forecasting the missing data using more advanced time series dynamic models. The
latter approach is used for this report.


9. The use of asymmetric filters results in the revision of initial estimates of the
seasonally adjusted and trend series as subsequent data points are added to the time
series. The class of asymmetric filters in X-11 uses implicit static forecast formulae.
The optimal choice of an asymmetric filter is determined by the statistical
characteristics of the time series. These characteristics are typically the ratio of average
absolute percentage changes in irregular/seasonal (denoted by I/S) and the ratio of average
absolute percentage changes in irregular/trend (denoted by I/C). Therefore, the
asymmetric filter approach used in X-11 is a semi-dynamic model in that the implicit
forecast formulae are not completely dependent on the nature of the time series. For
a further discussion of the asymmetric filter see Appendix B.


10. Statistics Canada developed X-11-ARIMA in the 1970's and 1980's in an
effort to reduce revisions. X-11-ARIMA uses Box and Jenkins ARIMA modelling
(Box and Jenkins 1970, Dagum 1980) to forecast the missing data points of the original series.
Quenneville and Ladiray (2000) use models to forecast missing seasonally adjusted
estimates and then apply the symmetric Henderson filter to produce trend estimates.
The ABS evaluated X-11-ARIMA in the 1980's but found too many series were not
automatically modelled by the package. It was concluded that X-11-ARIMA was not
suitable for mass production of ABS seasonal adjustment. The US Bureau of the
Census has recently released X-12-ARIMA (Findley et al. 1998), which also adopts
ARIMA modelling to predict missing data points. X-12-ARIMA includes a
regression-ARIMA component to estimate a range of data contamination effects.


11. This analysis uses the ABS seasonal adjustment package SEASABS (ABS, 
1999) to clean the time series of any abrupt changes in the trend and seasonal 
components, trading day, large extremes and numerous other possible effects. 
X-12-ARIMA was then used to fit and evaluate the ARIMA extension method to 
improve the "end-weight" problem. We evaluated the revision properties of the first 
seasonally adjusted estimates. 


12. All series evaluated were seasonally adjusted directly. In practice, many ABS 
series are seasonally adjusted then aggregated to form an indirect seasonally 
adjusted estimate at a higher level. 


Section 3: Comparison of revisions between X-11 and different ARIMA options 


13. The performance of the following four modelling options was
compared using X-12-ARIMA:


1. the standard X-11 (using asymmetric filters) 

2. the "airline" ARIMA model (dynamic model) 

3. "super model" (static ARIMA model, see Appendix C for details) with 
prefixed parameters 

4. automatic selection of ARIMA model (see Appendix D). 


14. These options let us evaluate the performance of
• a common dynamic ARIMA model (ie. the "airline" model) and an empirical
static ARIMA (ie. the "super" model) in comparison with the standard
X-11, and
• an "optimal" ARIMA model, which is automatically selected by the default
model selection procedure in X-12-ARIMA (see Appendix D for more
details), in comparison with the common "airline" model.


The result of such comparisons will provide a clear and insightful picture of the 
strengths and weaknesses of the different "end-weight" methodologies. 


Additional investigations were performed using the "outlier" modification of 
regression-ARIMA in X-12-ARIMA. 


15. The desired properties of revisions are:


1. The first seasonally adjusted and trend estimates (ie. the estimates 
produced when the time period of interest is the last in the series) are as 
close as possible to the final estimates when more observations are 
added to the series (this may be some years after the initial data point was 
first added to the series). 

2. The seasonally adjusted and trend estimates subsequent to the first 
estimate converge as quickly and smoothly as possible to the final 
estimates. 


16. Revisions are calculated by simulating monthly (or quarterly) updates of
historical data over a specific time span. The revision measure used in this paper is
the mean absolute percentage revision in the level of seasonally adjusted estimates
from the first to the final estimate. This is denoted by R and defined as

$$R = \frac{100}{T_1 - t_0 + 1} \sum_{t=t_0}^{T_1} \frac{|A_{t|T} - A_{t|t}|}{A_{t|T}}$$

where $A_{t|\tau}$ is the seasonally adjusted estimate of date t
given data up to date $\tau$ ($t \le \tau$); T is the end date of the original series; $t_0$ is the start date of the
original series minus seven years of data ($t_0$ = SD - 84 for monthly series or SD - 28 for
quarterly series); $T_1$ is less than or equal to T minus three years of data ($T_1 \le T - 36$ for
monthly series or T - 12 for quarterly series). The generic form of this measure can
be found in Appendix E.


17. To compare the performance of the ARIMA options against the standard X-11,
we use the percentage revision ratio between an ARIMA option and the standard X-11,
RR = 100 x R(ARIMA)/R(X11), which has been found most effective. For example, 100 would
mean that the ARIMA option and standard X-11 performed equally with regard to
revisions, while 90 would mean the ARIMA option was 10% better than the standard
X-11.
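
The two measures defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy numbers are hypothetical, and the inputs are assumed to be the first and final seasonally adjusted estimates for the same dates, already aligned.

```python
from typing import Sequence


def mean_abs_pct_revision(first: Sequence[float], final: Sequence[float]) -> float:
    """Mean absolute percentage revision from the first to the final estimate.

    first[i] is the first (concurrent) estimate for a date and final[i] is the
    final estimate for the same date. Names are illustrative only.
    """
    if len(first) != len(final) or len(first) == 0:
        raise ValueError("first and final must be non-empty and of equal length")
    total = sum(abs(fin - fst) / fin for fst, fin in zip(first, final))
    return 100.0 * total / len(first)


def revision_ratio(r_arima: float, r_x11: float) -> float:
    """Percentage revision ratio RR = 100 x R(ARIMA) / R(X11)."""
    return 100.0 * r_arima / r_x11


# Toy numbers, not taken from the paper.
r_x11 = mean_abs_pct_revision([98.0, 204.0], [100.0, 200.0])
r_arima = mean_abs_pct_revision([99.0, 202.0], [100.0, 200.0])
rr = revision_ratio(r_arima, r_x11)
```

With these toy inputs the ARIMA option has half the revision of the standard X-11, so RR is close to 50.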


18. Table 2 summarises the results from a total of 820 time series from a variety
of different sources. In general the "airline" model consistently outperforms the
standard X-11, with an overall reduction in revisions of about 6-7%. This is in spite of
the fact that the "airline" model may not pass the rigorous model selection criteria.
The static "super model" also outperforms X-11 overall. This implies that the
forecast models implicitly embedded in the asymmetric filters of the standard X-11
generally lead to larger revisions than the "airline" and "super" models.


19. For the three X-11 with ARIMA extension (airline, super, automatic) seasonal
adjustments, the use of the regression-ARIMA outlier removing option resulted in
small gains averaging between 1.5% and 2.8%. However, the trade off was
increased variability as measured by the standard deviations of R.


20. In calculating the statistics for the automatic ARIMA model selection
approach (with and without outlier options), all series which failed to meet the model
fitting criteria were excluded. The automatic procedure fitted an ARIMA model for
576 (ie. 70.2%) of the series. Of these, 101 series (ie. 17.5%) performed worse than
the standard X-11. The general performance of the automatic model selection option
using X-12-ARIMA (see Appendix E for more details) is almost the same as the
"airline" model. A general conclusion is that for Australian data, the sophistication of
choosing a "best" ARIMA model is generally no better than just choosing the simple
"airline" model (see Appendix F for more detailed comparisons between the
performances of the "airline" and automatic models).


Table 2: Means, medians and standard deviations for ARIMA/X11 revision ratio (%)

For each group the rows below give the group mean, the median (in brackets) and the standard deviation of the ARIMA/X11 revision ratios.

Columns: group | number of series | airline | airline with outlier option | super model | super model with outlier option | automatic model | automatic model with outlier option | fitted by auto model: auto fit, RR > 100, (as %)
90.5 88.2 94.5 91.4 91.7 92.1 = 
(90.2) (88.6) (92.6) (90.4) (93.0) (93.0) 
8.5 9.0 12.1 11.9 7.6 10.3 47. a 


Labour 258 94.0 92.5 97.6 95.3 93.5 92.7 180 
force (93.6) | (92.1) | (96.3) | (94.3) | (94.1) | (93.6) 32 
7.3 73 8.4 8.4 7.0 7.0 82.2% 
Retail 202 93.1 92.3 93.4 91.5 93.7 93.3 197 
(93.4) | (91.1) | (97.9) | (93.3) | (93.6) | (93.7) 28 
6.3 7.5 6.1 8.1 6.4 7.5 81.7% 


Building 174 94.0 90.9 104.6 101.9 94.9 92.3 70 
activity (93.4) | (91.4) | (102.7) | (99.6) | (94.2) (92.5) 20 
8.4 11.3 13°7 16.2 7.7 10.2 51.7% 


93.8 82.8 95.8 84.6 94.7 83.3 44 


(92.5) | (83.7) | (94.8) | (86.8) | (93.6) | (83.4) 11 
8.2 14.7 8.1 172 7.0 16.6 82.1% 


89.3 89.3 89.6 88.8 88.5 97.0 . 
(88.7) | (87.8) | (90.3) | (89.0) | (88.9) el 7) 

5.0 5.3 5.5 5.2 5.7 35. 7 
93.6 93.5 94.2 93.6 93.4 7 7 
(92.8) | (92.6) | (94.2) | (94.1) | (91.9) os o 

4.2 3.9 Ad 4.6 4.0 33. 2 


103.2 96.5 101.5 100.7 97.1 8 2 
(104.6) | (99.9) | (99.3) | (100.7) | (97.5) | (88.1) 
10.9 17.8 13.5 23.5 12.1 15.8 sr 
99.4 94.0 104.9 103.4 97.3 95.0 
(99.4) | (99.4) | (106.8) | (105.4) | (95.2) | (94.8) 
12.1 16.9 12.4 18.8 8.5 15.8 7: ED 
96.7 94.4 100.9 99.1 91.4 85.4 
(100.8) | (97.8) | (104.5) | (103.1) | (89.7) Ge 5 
8.4 8.1 9.2 8.3 9.0 7% 
89.7 89.8 91.5 91.3 88.6 6 7 
ep .) (90 8) (91.2) | (91.2) | (87.4) eo o 
5.1 5.5 4.9 71 a 
33 8 ai 53 97.6 94.8 93.8 92 E 475 
i 2) C a (95.3) | (93.5) | (93.6) | (93.2) 101 
10.3 12.9 7.1 9.7 70.2% 


1. The selected sub-sample was formed by selecting series with a broad range of characteristics 
from the total set of analysed series. 

2. Total excludes the selected sub-sample. 

3. Percentages refer only to the number of series that did not fail. 


Section 4: Which statistical measures can predict if X-11 with ARIMA
extension can produce less revision than the standard X-11?


21. This section addresses three questions: 


1. Is the revision related to a set of common statistical measures? 

2. Does the X-11 with ARIMA extension (eg. the "airline" model) always 
perform better than the standard X-11 for any time series? 

3. Under what circumstances does the standard X-11 perform better than 
ARIMA models? What are the characteristics of a time series that can 
predict this? 


22. A selection of five measures is given in Table 3. All of the measures are
quantitative measures of the volatility of a series. Different measures provide
information on the volatility of a series from different perspectives. The STAR value
measures the volatility in the irregular component of a series; the I/C and I/S
ratios consider the trend and seasonality of a series; TP measures the frequency of
cycles of a series rather than its smoothness; and Pouts is a measure of
the number of outliers in a series, which may affect the ARIMA performance.


Table 3: Different measures used within the ABS. 


Measure | Description | Source
STAR | This value represents the average absolute percentage change in the irregular (residual) component of a series. |
I/C | A ratio of the average absolute percentage change in the irregular component and the average absolute percentage change in the X-11 trend. For an additive series it is the ratio of the average absolute change rather than percentage change in both cases. | SEASABS
TP | An indicator measuring the number of turning points in a series. In calculating this measure, we used the trend component of the series and calculated the 10 year average number of turning points for comparability among all series. | Created
Pouts | A ratio measuring the proportion of total number of outliers over total number of observations in a series. The outliers in a series are defined by X-11 SEASABS. | Created
I/S | A ratio of the average absolute percentage change in the irregular component and the average absolute percentage change in the X-11 seasonal factors. For an additive series it is the ratio of the average absolute change rather than percentage change in both cases. | SEASABS
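
As a rough illustration of the STAR, I/C and I/S definitions in Table 3, the sketch below computes them from already-decomposed components of a multiplicative series. The function names are hypothetical, and the use of simple period-to-period changes for every component is an assumption made for illustration; SEASABS may compute these quantities differently.

```python
from typing import Sequence


def avg_abs_pct_change(values: Sequence[float]) -> float:
    """Average absolute period-to-period percentage change of a component."""
    changes = [abs(curr / prev - 1.0) * 100.0 for prev, curr in zip(values, values[1:])]
    return sum(changes) / len(changes)


def star(irregular: Sequence[float]) -> float:
    """STAR: average absolute percentage change in the irregular component."""
    return avg_abs_pct_change(irregular)


def i_over_c(irregular: Sequence[float], trend: Sequence[float]) -> float:
    """I/C: irregular change relative to trend change."""
    return avg_abs_pct_change(irregular) / avg_abs_pct_change(trend)


def i_over_s(irregular: Sequence[float], seasonal: Sequence[float]) -> float:
    """I/S: irregular change relative to seasonal factor change."""
    return avg_abs_pct_change(irregular) / avg_abs_pct_change(seasonal)


# Toy components, not taken from the paper.
irregular = [1.0, 1.1, 0.99]      # changes of about 10% each period
trend = [100.0, 102.0, 104.04]    # changes of about 2% each period
ratio = i_over_c(irregular, trend)
```

In this toy case the irregular moves about five times as much as the trend, so the I/C ratio is close to 5.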


23. We used binary tree recursive partition regression and simple regression
(with non-linear transformation) to explore the relationships between the revision
measure R (generated by the standard X-11, and X-11 with ARIMA extensions) and
the five statistical measures. All of the analyses consistently suggest that the STAR
measure is the most significant predictor (with explanatory power of R² > 0.9) of
the revision size of the three seasonal adjustment methods (the automatic model
method is excluded because for 30% of the series, X-12-ARIMA could not find a
suitable ARIMA model), although all five measures have statistically significant
effects (see Appendix G for more details). The result is intuitive in that the higher the
measure of volatility, the larger the revisions. This is shown in Figure 1.


Figure 1: Relation between mean absolute percentage revision in the seasonally adjusted
estimates from the first to the final estimate (R) and STAR

[Plot titled "Relations between STAR and Revisions": log(Revision) plotted against log(STAR), with a series labelled X11.]


24. Figure 1 shows that the revisions from the "airline" model tend to be smaller
than those from the standard X-11 as the STAR value increases.


25. As the focus of the paper is on the relative performance of ARIMA methods to 
X11, we further investigated the possible statistical relationships between the 
percentage revision gain (defined as RR - 100) and the five measures. Figure 2 
shows the distributions of the revision gains of the "airline" and "super" model and 
the five measures. 


Figure 2: Empirical distributions


26. The skewed distributions of I/C and STAR suggest that a logarithmic
transformation is required when a linear regression analysis is performed. The three
analyses (see details in Appendix G), binary tree recursive partition regression,
transformed linear regression and logistic regression, consistently show that I/C
is the most significant measure for predicting the revision gains, followed by the STAR
measure. The rest of the measures are either not significant or inconclusive as predictors of the revision
gains from the "airline" and "super" models. Table 4 shows the results of the linear
regression fitting RR-100 to log(STAR), log(I/C) and I/S.


Table 4: Estimated model: RR - 100 = a + b log(STAR) + c log(I/C) + d (I/S)

Airline Model: Estimated Revision Gain (%) ~ log(STAR) + log(I/C)
Correlation of coefficients:
  | a | b
b | -0.6553 |
c | -0.2776 | -0.1270

Super Model: Estimated Revision Gain (%) ~ log(STAR) + log(I/C) + I/S
Correlation of coefficients:
  | a | b | c
b | -0.1900 | |
c | 0.0327 | -0.1250 |
d | -0.9531 | -0.0087 | -0.1218

27. The estimated coefficients in Table 4 demonstrate that X-11 with ARIMA
performs better than the standard X-11 when the I/C and STAR values increase.
The estimated coefficients of STAR from the two ARIMA models are very close. This
indicates that the STAR measure has a relatively consistent effect on the revision gain.


28. More importantly, I/C is the most significant measure. The negative
coefficients of I/C in Table 4 suggest that the larger the I/C, the better ARIMA
models perform relative to the standard X-11. The "airline" model is likely to have less revision
when I/C is greater than 0.1675 (ie. log(I/C) is greater than -1.787 (see Graph A13
in Appendix G)). The "super" model is likely to have less revision when I/C is
greater than 1.047 (ie. when log(I/C) is greater than 0.0464 (see Graph A16 in
Appendix G)). For the total 820 series, there are 816 (99.9%) and 616 (75%) series
having I/C values greater than 0.1675 and 1.047 respectively. This implies that the
asymmetric trend filter used in X-11, which is derived from the I/C value,
performs better than the two ARIMA models only when the I/C is extremely small (ie.
the movements of the seasonally adjusted series are dominated by the trend
movements).


29. The positive coefficient of I/S in Table 4 shows that this measure has a positive
effect on RR-100 (although it is not significant for the "airline" model). This suggests
that the implied seasonal component model embedded in the ARIMA model may not
perform as well as the X-11's seasonal asymmetric filter when a series is volatile
and has fewer seasonal pattern changes.


30. As discussed above, given a fixed noise level (ie. a fixed STAR) of a series,
a suitable ARIMA model is likely to perform better than the standard X-11 when a
series has less trend movement (ie. large I/C) and large seasonal pattern change
(ie. small I/S). In other words, the noisier the series, the better an ARIMA model
performs relative to the standard X-11, provided a suitable ARIMA model can be found.


31. A dynamic ARIMA model is a global model using a reasonable length of
series to predict missing observations. The Musgrave asymmetric trend filter used in 
the standard X-11 is a local linear model using only the last few observations with a 
global parameter I/C to predict missing observations. Under a large I/C, a suitable 
ARIMA model is more likely to provide a better general indication of the trend 
movement while the Musgrave asymmetric filter tends to bend the trend movement 
towards zero. In addition a suitable ARIMA model is also likely to be more adaptive 
to seasonal pattern changes than the implied forecast formulae of asymmetric 
seasonal filters (see details in Appendix B). 


32. Based on our results, and earlier results of Statistics New Zealand (Statistics 
New Zealand, 2000), it is possible that ARIMA modelling is better than the X-11 
asymmetric filters when a time series is more volatile (large I/C and STAR) and 
strong in seasonal movement (small I/S), or vice versa. 


Section 5: Conclusions and future directions 


33. We used the dynamic "airline" and static "super" ARIMA models to show that
the implied forecast model of the asymmetric filter in the standard X-11 does not 
perform as well as the ARIMA models in terms of revisions to the seasonally 
adjusted estimates. In other words, an imperfect dynamic ARIMA model still 
produces less revision than the standard X-11 does. 


34. We have shown that the revision size in seasonal adjustment is dependent on
a range of volatility measures (STAR, I/C, I/S, TP, Pout) of a series. STAR is the 
most dominant measure for predicting the revision size. An empirical non-linear 
relationship between the average percentage revision and STAR can be 
established. 


35. We have identified the condition under which the X-11 with ARIMA extension
is most likely to have less revision. This condition is that a time series is relatively
volatile (ie. with relatively large values of I/C and STAR) and strong in seasonality (ie.
small I/S). In addition, we also have a better understanding of when the standard
X-11 is likely to perform better if a suitable ARIMA model cannot be found.


36. Our research suggests that future research for reducing the "end-weight" 
problem should be focused on: 


1. Tuning or relaxing the current ARIMA model selection criteria or
constructing appropriate criteria to identify a suitable model that still
ensures a reduction in revisions.


2. Changing the standard X-11 default option if a suitable ARIMA model
cannot be found, eg. using the standard X-11 when I/C and STAR are
relatively small, otherwise using the "airline" model.


3. Choosing an optimal length of data to balance the "global" nature of an 
ARIMA model with local dynamics at the end of series. 
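
The fallback suggested in item 2 can be sketched as a simple decision rule. This is a hypothetical illustration, not an ABS procedure: the function name is invented, the I/C cut-off of 0.1675 is borrowed from the "airline" model result in paragraph 28, and the STAR cut-off of 1.0 is an arbitrary placeholder that would need to be chosen empirically.

```python
def choose_end_weight_method(
    auto_model_found: bool,
    ic: float,
    star: float,
    ic_small: float = 0.1675,   # from paragraph 28 ("airline" model)
    star_small: float = 1.0,    # placeholder only, not a value from the paper
) -> str:
    """Pick an end-weight method following the suggestion in item 2."""
    if auto_model_found:
        # A suitable ARIMA model passed the selection criteria.
        return "automatic ARIMA"
    if ic < ic_small and star < star_small:
        # Trend-dominated, quiet series: keep the asymmetric X-11 filters.
        return "standard X-11"
    # Otherwise fall back to the "airline" model rather than standard X-11.
    return "airline"
```

For example, a series with no suitable automatic model, I/C of 2.0 and STAR of 3.0 would be assigned the "airline" model under these assumed cut-offs.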


Appendix A: The sources of time series and their characteristics

The table below summarises characteristics of the 820 series used in the analysis.


Table A1: The sources of time series and their characteristics 
group | number of series analysed in group | period | data span | stock/flow
Labour Force | | monthly (229), quarterly (17) | 23 yrs (monthly), 15 yrs (quarterly) | stock
Retail | | monthly | 19 yrs (1 series 39 yrs) | flow
Employment and Earnings | 67 | monthly | 18 yrs (1 series 12 yrs) | stock
New Motor Vehicle Registration | | monthly | 34 yrs | flow
Stocks and Sales | | quarterly | 12-17 yrs | stock/flow
Job Advertisements | | monthly | 10-26 yrs |

Notes: 

1. The selected sub-sample was formed by selecting series with a broad range of characteristics 
from the total set of analysed series. 

2. Total excludes the selected sub-sample. 


Appendix B: Asymmetric weights versus forecasts 
Simple example for Seasonal x Irregular (SI) values of a particular month or quarter


Let $SI_t$ for t = 1, ..., n (year) be the data.


Assume a symmetric 3x5 moving average is used in the standard X-11 to produce the
seasonal factors, with weights

$$W_{-3}=1/15,\ W_{-2}=2/15,\ W_{-1}=3/15,\ W_{0}=3/15,\ W_{1}=3/15,\ W_{2}=2/15,\ W_{3}=1/15$$

To get seasonal factors for $S_{n-2}$, $S_{n-1}$, $S_n$, X-11 uses asymmetric weights that only
involve the recent past but not the future. These are given by


n-2 (third last point)

$$W_{-3,n-2}=4/60,\ W_{-2,n-2}=8/60,\ W_{-1,n-2}=13/60,\ W_{0,n-2}=13/60,\ W_{1,n-2}=13/60,\ W_{2,n-2}=9/60$$

$$S_{n-2} = W_{-3,n-2} SI_{n-5} + W_{-2,n-2} SI_{n-4} + W_{-1,n-2} SI_{n-3} + W_{0,n-2} SI_{n-2} + W_{1,n-2} SI_{n-1} + W_{2,n-2} SI_{n}$$

n-1 (second last point)

$$W_{-3,n-1}=4/60,\ W_{-2,n-1}=11/60,\ W_{-1,n-1}=15/60,\ W_{0,n-1}=15/60,\ W_{1,n-1}=15/60$$

$$S_{n-1} = W_{-3,n-1} SI_{n-4} + W_{-2,n-1} SI_{n-3} + W_{-1,n-1} SI_{n-2} + W_{0,n-1} SI_{n-1} + W_{1,n-1} SI_{n}$$

n (last point)

$$W_{-3,n}=9/60,\ W_{-2,n}=17/60,\ W_{-1,n}=17/60,\ W_{0,n}=17/60$$

$$S_{n} = W_{-3,n} SI_{n-3} + W_{-2,n} SI_{n-2} + W_{-1,n} SI_{n-1} + W_{0,n} SI_{n}$$


These asymmetric weights can also be viewed as forecasting the SI's and then
applying the symmetric moving average. For the case above we have

$$SI_{n+1} = 0.25\,SI_{n} + 0.25\,SI_{n-1} + 0.25\,SI_{n-2} + 0.25\,SI_{n-3}$$

$$SI_{n+2} = 0.25\,SI_{n} + 0.25\,SI_{n-1} + 0.25\,SI_{n-2} + 0.25\,SI_{n-3}$$

$$SI_{n+3} = SI_{n-2}$$


It can be noted that different weighting patterns may be used for different forecast
horizons, and in the case of the standard X-11 the forecast is a little bit bizarre.
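
The equivalence between the asymmetric weights and "forecast, then apply the symmetric filter" can be checked with exact fractions. The sketch below is an illustration with hypothetical names: it extends the observed SI values with the three implied forecasts from this appendix (the mean of the last four values for the first two horizons, and the third last observed value for the third horizon), applies the symmetric 3x5 weights, and collects the resulting weight on each observed value.

```python
from fractions import Fraction as F

# Symmetric 3x5 weights W_{-3}, ..., W_{3}.
SYMMETRIC_3X5 = [F(1, 15), F(2, 15), F(3, 15), F(3, 15), F(3, 15), F(2, 15), F(1, 15)]

OBS = 6  # observed values SI_{n-5}, ..., SI_n are indexed 0, ..., 5


def unit(i):
    """Coefficient vector that picks out the observed value with index i."""
    return [F(int(j == i)) for j in range(OBS)]


# Implied forecasts expressed as combinations of observed values.
mean_last4 = [F(0), F(0), F(1, 4), F(1, 4), F(1, 4), F(1, 4)]
extended = [unit(i) for i in range(OBS)] + [mean_last4, mean_last4, unit(3)]
# extended[6], [7], [8] stand for SI_{n+1}, SI_{n+2}, SI_{n+3}; unit(3) is SI_{n-2}.


def end_weights(points_from_end):
    """Implied weights on SI_{n-5}..SI_n for the point n - points_from_end."""
    centre = OBS - 1 - points_from_end
    weights = [F(0)] * OBS
    for offset, w in zip(range(-3, 4), SYMMETRIC_3X5):
        term = extended[centre + offset]
        weights = [acc + w * coeff for acc, coeff in zip(weights, term)]
    return weights


last = end_weights(0)          # weights for S_n
second_last = end_weights(1)   # weights for S_{n-1}
third_last = end_weights(2)    # weights for S_{n-2}
```

Under these forecasts the computed weights reproduce the 9/60, 17/60, 17/60, 17/60 pattern for the last point, and the corresponding patterns for the second and third last points.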


ARIMA modelling 


An alternative to the asymmetric weights is to fit an ARIMA model to the SI's. 
Consider a simple ARIMA model given by 


$$(1-B)\,SI_t = (1-\theta B)\,E_t$$


where B is the backward shift operator, $\theta$ is an unknown parameter and $E_t \sim N(0, \sigma^2)$.
In this case the SI's are forecast using a linear combination of the historical
SI's determined by the differencing and moving average parameter.


The symmetric moving average is then applied to the forecasted SI's. Because of 
the control over the differencing and moving average parameter a much larger range 
of filters are available. The ARIMA modelling case could be converted back to 
asymmetric filters. Generally the asymmetric ARIMA filters would go back a long 
way into the past (with small weights), but this could be controlled by using a "local" 
ARIMA model or using a purely autoregressive model. 
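
For the simple model above, the forecast at every horizon is the exponentially weighted average of past values with weights $(1-\theta)\theta^j$ on $SI_{n-j}$, which is why the implied asymmetric filter reaches far into the past with small weights. The sketch below, with hypothetical function names and an arbitrary $\theta$ of 0.5, truncates that infinite sum at a chosen number of terms and combines it with the symmetric 3x5 weights to give the implied filter for the last point.

```python
SYMMETRIC_3X5 = [1 / 15, 2 / 15, 3 / 15, 3 / 15, 3 / 15, 2 / 15, 1 / 15]


def ima11_forecast_weights(theta: float, n_terms: int) -> list:
    """Weights on SI_n, SI_{n-1}, ... for the forecast, truncated at n_terms."""
    return [(1.0 - theta) * theta ** j for j in range(n_terms)]


def implied_last_point_filter(theta: float, n_terms: int) -> list:
    """Implied asymmetric weights on SI_n, SI_{n-1}, ... for the last point.

    Assumes n_terms >= 4 and the same forecast for all three missing values,
    which holds for this model because its forecast function is flat.
    """
    forecast_weights = ima11_forecast_weights(theta, n_terms)
    future_weight = sum(SYMMETRIC_3X5[4:])  # symmetric weight on the three forecasts
    weights = [future_weight * w for w in forecast_weights]
    # Add the symmetric weights W_0, W_{-1}, W_{-2}, W_{-3} on the observed values.
    for j, w in enumerate(SYMMETRIC_3X5[3::-1]):
        weights[j] += w
    return weights


filter_weights = implied_last_point_filter(theta=0.5, n_terms=40)
```

The first few weights are the largest and the rest decay geometrically, and the truncated weights sum to very nearly one.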


Appendix C: ARIMA "Super Model" 


The "super model" is an empirical ARIMA model with fixed parameters. It has been 
included in this paper to see how a static ARIMA model performs against the 
standard X-11. To estimate such a model a possible solution is to combine many 
series to find a more suitable general model that can capture collection wide 
characteristics that could not be detected in individual time series. 


The form of the ARIMA "super model" is given by 


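monthly (0,1,13)(0,1,0): $(1-B)(1-B^{12})\log(O_t) = (1-\theta_1 B-\theta_2 B^2-\dots-\theta_{12} B^{12}-\theta_{13} B^{13})E_t$, $E_t \sim N(0,\sigma^2)$
quarterly (0,1,5)(0,1,0): $(1-B)(1-B^{4})\log(O_t) = (1-\theta_1 B-\theta_2 B^2-\theta_3 B^3-\theta_4 B^4-\theta_5 B^5)E_t$, $E_t \sim N(0,\sigma^2)$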


The parameters were estimated by maximising the sum of log likelihoods over a few
hundred time series simultaneously. They are given in the table below.


Table A2: Super model coefficients 
Quarterly model
| lag1 | lag2 | lag3 | lag4 | lag5 |
| 0.39627 | 0.14481 | -0.0079364 | 0.66054 | -0.19403 |

Monthly model
| lag1 | lag2 | lag3 | lag4 | lag5 | lag6 | lag7 |
|  |  |  |  |  |  |  |

| lag8 | lag9 | lag10 | lag11 | lag12 | lag13 |
| -0.0010558 | -0.023753 | -0.0019553 | -0.0080748 | 0.72490 | -0.23656 |


Appendix D: Automatic modelling X-12-ARIMA 


Five default models 


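Models chosen by empirical research on a large selection of data. See (Dagum
1988) for further details.


In the models below, s is the seasonal period (12 for monthly and 4 for quarterly series).

Model 1 (0 1 1) (0 1 1) X: $(1-B)(1-B^{s})\log(O_t) = \mu + (1-\theta_1 B)(1-\Theta_1 B^{s})E_t$
Model 2 (0 1 2) (0 1 1) X: $(1-B)(1-B^{s})\log(O_t) = \mu + (1-\theta_1 B-\theta_2 B^2)(1-\Theta_1 B^{s})E_t$
Model 3 (2 1 0) (0 1 1) X: $(1-\phi_1 B-\phi_2 B^2)(1-B)(1-B^{s})\log(O_t) = \mu + (1-\Theta_1 B^{s})E_t$
Model 4 (0 2 2) (0 1 1) X: $(1-B)^2(1-B^{s})\log(O_t) = \mu + (1-\theta_1 B-\theta_2 B^2)(1-\Theta_1 B^{s})E_t$
Model 5 (2 1 2) (0 1 1): $(1-\phi_1 B-\phi_2 B^2)(1-B)(1-B^{s})\log(O_t) = (1-\theta_1 B-\theta_2 B^2)(1-\Theta_1 B^{s})E_t$

[Note: X signifies mean to be estimated]
Estimation - exact maximum likelihood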


Set of selection criteria 


The following set of criteria is used, based on forecasting performance, white noise
properties of the residuals and cancellation of parameters on the moving average
side with the differencing in the ARIMA model.

(A) within sample forecast error (<15% default)

(B) Box-Ljung Q statistic (>5% default)

(C) over differencing on MA parameters (default <0.9)


Options for "best" model

(A) first model, in order, that passes selection criteria

(B) best model in terms of within sample forecast error that passes
selection criteria.
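
The selection logic described above can be sketched as follows. This is a simplified illustration and not the X-12-ARIMA implementation: the class, field and function names are hypothetical, and each candidate is assumed to arrive with its three diagnostic values already computed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class CandidateFit:
    name: str
    forecast_error_pct: float  # within sample forecast error (%)
    box_ljung_pct: float       # Box-Ljung Q statistic significance level (%)
    overdiff_value: float      # over differencing check on the MA parameters


def passes(fit: CandidateFit, max_error: float = 15.0,
           min_q: float = 5.0, max_overdiff: float = 0.9) -> bool:
    """Apply criteria (A), (B) and (C) with the default limits quoted above."""
    return (fit.forecast_error_pct < max_error
            and fit.box_ljung_pct > min_q
            and fit.overdiff_value < max_overdiff)


def select_model(fits: Sequence[CandidateFit], option: str = "first") -> Optional[CandidateFit]:
    """Option "first": first passing model in order; option "best": lowest forecast error."""
    passing = [fit for fit in fits if passes(fit)]
    if not passing:
        return None  # no model selected; fall back to the standard X-11
    if option == "first":
        return passing[0]
    return min(passing, key=lambda fit: fit.forecast_error_pct)


# Made-up diagnostics for three of the default models.
candidates = [
    CandidateFit("(0 1 1)(0 1 1)", 16.0, 10.0, 0.5),
    CandidateFit("(0 1 2)(0 1 1)", 12.0, 8.0, 0.6),
    CandidateFit("(2 1 0)(0 1 1)", 9.0, 20.0, 0.4),
]
first_choice = select_model(candidates, "first")
best_choice = select_model(candidates, "best")
```

With these made-up diagnostics the first model fails the forecast error criterion, so option (A) returns the second model while option (B) returns the third.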


Appendix E: Revision measure used in paper 
The generic formula for a lag k (k = 0, 1, 2, ..., m) average absolute percentage
revision from the start point $t_0$ ($t_0$ is greater than the start date of the series) to the end
point $T_1$ ($T_1$ is less than the end date of the series) is defined as follows.


The mean absolute percentage revision $R_k$ on specific lags is calculated as

$$R_k = \frac{100}{T_1 - t_0 + 1} \sum_{t=t_0}^{T_1} \frac{|A_{t|T} - A_{t|t+k}|}{A_{t|T}}$$

where

k ($0 \le k \le 36$ for monthly series or 12 for quarterly series) is the lagged period,
where k=0 represents the 1st estimate, k=1 represents the 2nd estimate etc;


T is the end date of the original series;


$t_0$ is the start date of the original series minus seven years of data ($t_0$ = SD - 84 for
monthly series or SD - 28 for quarterly series);


$T_1$ is less than or equal to T minus three years of data ($T_1 \le T - 36$ for monthly
series or T - 12 for quarterly series);

$A_{t|t+k}$ is the lag k estimate of the concurrent seasonally adjusted series at date t (eg.
if k=0 and t=Aug 98 it represents the 1st estimate of Aug 98), ie. the value of the
seasonally adjusted series at time t calculated from the start date of the series up
to $T_1 + k$ (when $t = T_1$);


$A_{t|T}$ is the "final" value of the seasonally adjusted series at time t, ie. the value of
the seasonally adjusted series at time t calculated from the start date of the
original series to the end date T.


Appendix F: Diagram showing the performance of the automatic model 
selection within X-12-ARIMA 


Figure A1: Automatic ARIMA modelling with outliers turned off.


Total series (820): Auto (782) 93.4; Airline (816) 91.3
  Auto modelling worked (782): Auto 93.4; Airline 91.5
    Picked a model (676): Auto 92.3; Airline 91.1
      Auto model is not Airline (518): Auto 92.4; Airline 90.8
        Auto better than Airline (199): Auto 90.1; Airline 93.1
        Auto worse than Airline (319): Auto 93.7; Airline 89.4
      Auto model is Airline (158): Auto 92.2; Airline 92.2
    Reverted to X11 (106): Auto 100.0; Airline 93.6
  Auto modelling failed (38)
    Airline worked (34): Auto NA; Airline 87.9
    Airline failed (4): Auto NA; Airline NA


Note: the number in a pair of brackets is the number of series. The figure against a
particular model is the RR of the model against the standard X-11.


Figure A2: Automatic ARIMA modelling with outliers turned on.


Total series (820): Auto (805) 95.6; Airline (817) 93.8
  Auto modelling worked (805): Auto 95.6; Airline 93.8
    Picked a model (576): Auto 93.8; Airline 94.0
      Auto model is not Airline (421): Auto 93.7; Airline 93.9
        Auto better than Airline (197): Auto 92.9; Airline 95.2
        Auto worse than Airline (224): Auto 94.4; Airline 92.9
      Auto model is Airline (155): Auto 94.2; Airline 94.2
    Reverted to X11 (229): Auto 100.0; Airline 93.2
  Auto modelling failed (15)
    Airline worked (12): Auto NA; Airline 94.6
    Airline failed (3): Auto NA; Airline NA


Note: the number in a pair of brackets is the number of series. The figure against a
particular model is the RR of the model against the standard X-11.


Appendix G: Statistical Analysis 


To explore whether the statistical measures can help predict revision size, we have 
analysed five selected measures (STAR, I/C, I/S, TP and Pouts) against two revision 
measures: R, (mean absolute percentage revisions) and RR (the revision ratio of the 
R, from X-11 with ARIMA extension to the R, from the standard X-11). 
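
As a rough illustration of the two revision measures, the sketch below computes a mean absolute percentage revision from paired early and final estimates, and then a revision ratio expressed as a percentage of the standard X-11 figure. The function names, the exact form of the percentage revision and all of the numbers are assumptions made for illustration only; they are not taken from the paper.

```python
def mean_abs_pct_revision(early, final):
    """Mean of |early - final| / |final| * 100 over matching time points."""
    if len(early) != len(final) or not early:
        raise ValueError("need two non-empty series of equal length")
    total = sum(abs(e - f) / abs(f) * 100.0 for e, f in zip(early, final))
    return total / len(early)


def revision_ratio(r_arima, r_x11):
    """RR = 100 * R(X-11 with ARIMA extension) / R(standard X-11)."""
    return 100.0 * r_arima / r_x11


# Hypothetical seasonally adjusted values (not from the paper).
final = [100.0, 102.0, 104.0, 103.0]
first_x11 = [101.0, 100.5, 105.5, 104.0]
first_arima = [100.8, 100.9, 105.0, 103.9]

r_x11 = mean_abs_pct_revision(first_x11, final)
r_arima = mean_abs_pct_revision(first_arima, final)
rr = revision_ratio(r_arima, r_x11)
print(round(r_x11, 3), round(r_arima, 3), round(rr, 1))
```

Under this reading, an RR below 100 means the ARIMA-extended estimates were revised less than the standard X-11 estimates.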


0. Descriptive Statistics 


Figure A3: Empirical distributions of measures: R,, Revision Gains (%), STAR, TP, I/C, I/S, and 
Pouts 

The skewed distributions of R,, I/C and STAR indicate that a logarithmic transformation 
is required when a linear regression analysis is performed. 


Figure A4: Box plots of R,, Revision Gains (%), STAR, TP, I/C, I/S, and Pouts 

To compare the distribution properties of the two ARIMA models against the 
standard X-11, we create a binary response variable, Y, as: 


    Y = 0 (FALSE)  if RR > 100, ie. the revision of X-11 is smaller 
    Y = 1 (TRUE)   if RR < 100, ie. the revision of X-11 is larger. 
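
The coding of Y can be sketched in a few lines. The function name is hypothetical, and returning None for RR exactly equal to 100 is an assumption, since the definition above does not cover that case.

```python
def binary_response(rr):
    """Code Y from a revision ratio RR.

    True  -> RR < 100, the standard X-11 revision is larger
    False -> RR > 100, the standard X-11 revision is smaller
    None  -> RR == 100, not covered by the definition (assumption)
    """
    if rr == 100:
        return None
    return rr < 100


# Hypothetical RR values for three series (not from the paper).
for rr in (93.4, 104.2, 100.0):
    print(rr, binary_response(rr))
```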


Figure A5: Box plots - Airline vs X11 against STAR, TP, I/S, I/C, and Pouts 

Figure A6: Super vs X11 against STAR, TP, I/S, I/C, and Pouts 

Figures A5 and A6 show that the medians and inter-quartile ranges of Pout for the 
FALSE and TRUE groups are more or less the same for both the "airline" and "super" 
models. This indicates that Pout is unlikely to have any effect on revision gain. 
However, the medians and inter-quartile ranges of STAR and I/C differ 
noticeably between the two groups. This indicates that STAR and I/C may have some 
explanatory power. 


We use three regression methods, (1) binary tree recursive partition regression, (2) 
simple linear regression (with non-linear transformation) and (3) logistic regression, 
to explore and test the relationships between the revision measures (R,, RR or 
RR - 100) and the five statistical measures. 


1. Recursive Partition Regression Tree Analysis 


A tree-based recursive partition regression is a statistical technique for exploring the 
possible relationships between a dependent variable and its explanatory variables 
(Breiman et al., 1984). Our primary purpose is to explore the possible relationship 
of (1) revision R, and (2) the binary response variable Y with the five 
statistical measures. 
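
To illustrate what a primary split means, the sketch below searches for the single split that most reduces the within-group sum of squares, which is the usual criterion in CART-style regression trees. It is a minimal illustration only: the helper names and the data are hypothetical, and it is not the software used for the analyses reported here.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, target, features):
    """Return (score, feature, threshold) minimising the total SSE.

    rows     -- list of dicts, one per series
    target   -- key of the dependent variable
    features -- keys of the candidate explanatory variables
    """
    best = None
    for feat in features:
        levels = sorted({r[feat] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            cut = (lo + hi) / 2.0
            left = [r[target] for r in rows if r[feat] <= cut]
            right = [r[target] for r in rows if r[feat] > cut]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, feat, cut)
    return best


# Hypothetical series-level data (not from the paper).
rows = [
    {"STAR": 1.0, "IC": 0.5, "R": 0.4},
    {"STAR": 1.5, "IC": 2.0, "R": 0.5},
    {"STAR": 2.0, "IC": 0.8, "R": 0.6},
    {"STAR": 6.0, "IC": 1.9, "R": 2.1},
    {"STAR": 7.0, "IC": 0.6, "R": 2.4},
    {"STAR": 8.0, "IC": 1.2, "R": 2.2},
]
score, feature, cut = best_split(rows, "R", ["STAR", "IC"])
print(feature, cut)
```

A full tree repeats the same search within each resulting group; the variable chosen at the first step is the primary split.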


Revision measure R;: 

The following tree analyses use the revision measure R, from the standard X-11 and 
from X-11 with the "airline" and "super" models respectively as the dependent variable, 
and the five statistical measures as independent variables. The primary split on STAR 
suggests that the STAR value of a series is the best available measure to explain the 
size of revisions. 


Figure A7: X-11 R, regression tree 
Note: the length of a vertical branch pair indicates the relative explanatory power of 
the split. 


Figure A8: Airline model R, regression tree 
Note: the length of a vertical branch pair indicates the relative explanatory power of 
the split. 


Figure A9: Super model R, regression tree 
Note: the length of a vertical branch pair indicates the relative explanatory power of 
the split. 


Binary response variable Y: 

The following tree analyses use the binary response variable Y (ie. the TRUE and 
FALSE factor variable) from the "airline" and "super" models respectively as the 
dependent variable and the five statistical measures as independent variables. The 
primary splits on I/C and then STAR suggest that the I/C and STAR values of a time 
series are the best available explanatory variables for the likelihood that X-11 with 
ARIMA extension produces smaller revisions than the standard X-11. 


Figure A10: Airline model Y regression tree 
Note: the length of a vertical branch pair indicates the relative explanatory power of 
the split. 


Figure A11: Super model Y regression tree 
Note: the length of a vertical branch pair indicates the relative explanatory power of 
the split. 


2. Linear regression analysis 


Linear regression models (with log-transformation for some measures) are used to 
explore the relationships between the five statistical measures and the revision 
measures, namely revision R, and revision ratio RR (or revision gain). The skewed 
distributions of R,, STAR and I/C suggest that non-linear transformations are needed. 
We take a log-transformation for R,, STAR and I/C in the linear regression analysis. 
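
The effect of the log-transformation can be sketched with a one-variable least-squares fit. The data below are hypothetical and were constructed so that the revision measure grows as a power of STAR; after taking logs of both variables the relationship is exactly linear. The function name is an assumption, and natural logarithms are assumed throughout.

```python
import math


def ols_simple(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical, right-skewed STAR and revision values (not from the paper).
star = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
rev = [0.2, 0.3, 0.45, 0.675, 1.0125, 1.51875]

# Regress log(revision) on log(STAR).
a, b = ols_simple([math.log(s) for s in star], [math.log(r) for r in rev])
print(round(a, 4), round(b, 4))
```

A positive slope b in this form says that a one per cent increase in STAR is associated with roughly a b per cent increase in the revision measure.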


Revision R;: 

Table A3 shows the results of log R,, derived from the standard X-11 and from 
X-11 with the "airline" and "super" models, regressed on log(STAR), log(I/C), I/S, TP 
and Pout. All estimated coefficients are significant at the 5% level, and the 
goodness-of-fit measures (R²) show that the models adequately explain the variation 
of R, in terms of the five explanatory measures. 


Table A3: Regression model R, = a + b1 log(STAR) + b2 log(I/C) + c (I/S) + d (TP) + e (Pout) 


              | R, of X-11            | R, of Airline model   | R, of Super model     | 
              | coef. value | t-value | coef. value | t-value | coef. value | t-value | 
(Intercept) 
log(STAR) 
log(I/C) 


Note that the signs of the estimated coefficients are the same across each R,. This 
indicates that the five explanatory measures affect each R, in the same direction, 
although with different magnitudes. STAR and TP contribute positively to the revision 
measure R,, whilst I/C, I/S and Pouts contribute negatively. These results 
can be interpreted as: 

the more volatile the series (STAR), the larger the revision; 

the more turning points (TP), the larger the revision; 

the less movement in trend than in irregular (I/C), the smaller the revision; 

the less movement in seasonal than in irregular (I/S), the smaller the revision; 

the more data points modified by X-11 as outliers (Pouts), the smaller the 
revision. 

All the above results are intuitive and show how the five measures contribute to R,. 
Although all five measures have a significant effect on revision R,, STAR is the 
dominant explanatory measure. 


Revision gain (RR - 100): 

Tables A4 and A5 show the results of regression analyses of revision gain, RR - 100, 
on four statistical measures for the "airline" and "super" models. The results 
demonstrate that I/C is the most significant measure to explain the variation of 
revision ratios RR (or revision gains). However, the R² statistics indicate that the 
models used do not necessarily fit well. 


Table A4: Linear regression using revision gain as response variable: Airline model 


Model                                                     Coefficient (t Value) 

revision gain = a + b log(star)                           -5.061 (-12.823*) 
                                                          -1.191 (-4.158*) 

revision gain = a + b log(I/C)                            (-16.974*) 
                                                          (-7.649*) 

revision gain = a + b (I/S)                               (-3.874*) 
                                                          (-0.738) 

revision gain = a + b log(star) + c log(I/C)              (-10.715*) 
                                                          (-3.342*) 
                                                          (-7.212*) 

revision gain = a + b log(star) + c log(I/C) + d (I/S)    (-3.4000*) 
                                                          (-7.1744*) 
                                                          (-3.3413*) 
                                                          (0.1672) 


Figure A12: Airline model - STAR contribution to RR - 100 
Revision Gain (%) vs Star: Airline Model 

Figure A13: Airline model - I/C contribution to RR - 100 
Revision Gain (%) vs I/C: Airline Model 

Figure A14: Airline model - I/S contribution to RR - 100 
Revision Gain (%) vs I/S: Airline Model 

Figure A15: Airline model - Estimated revision gain 
Estimated Revision Gain (%) ~ log(STAR) + log(I/C) 
Airline Model 


Table A5: Linear regression using revision gain as response variable: Super Model 


Model                                                     Coefficient (t Value)    R² 

revision gain = a + b log(star)                           a:  -0.857 (-1.609)      0.0223 
                                                          b:  -1.625 (-4.203*) 

revision gain = a + b log(I/C)                            a:   0.337 (0.916)       0.246 
                                                          b:  -7.265 (-15.898*) 

revision gain = a + b (I/S)                               a: -12.806 (-7.194*)     0.043 
                                                          b:   2.220 (5.925*) 

revision gain = a + b log(star) + c log(I/C)              a:   1.228 (2.531*)      0.254 
                                                          b:  -0.955 (-2.802*) 
                                                          c:  -7.101 (-15.483*) 

revision gain = a + b log(star) + c log(I/C) + d (I/S)    a: -12.175 (-8.000*)     0.328 
                                                          b:  -0.981 (-3.031*) 
                                                          c:  -7.595 (-17.310*) 
                                                          d:   2.928 (9.240*) 


* indicates t statistics are significant at the 5% level. 


Figure A15: Super model - STAR contribution to RR - 100 
Revision Gain (%) vs Star: Super Model 

Figure A16: Super model - I/C contribution to RR - 100 
Revision Gain (%) vs I/C: Super Model 

Figure A17: Super model - I/S contribution to RR - 100 
Revision Gain (%) vs I/S: Super Model 

Figure A18: Super model - Estimated revision gain 


Estimated Revision Gain (%) ~ log(STAR) + log(I/C) + I/S 
Super Model 

From the above linear regression analyses, we conclude that the "airline" and 
"super" models are likely to have smaller revisions than the standard X-11 when I/C 
and STAR increase. The positive coefficients of I/S show that I/S has a positive 
effect on RR - 100 (although it is not significant for the "airline" model). This 
indicates that the implied seasonal component model embedded in both ARIMA models 
may not perform as well as the X-11 seasonal asymmetric filter when a series has 
relatively few seasonal pattern changes compared with its irregular. The significant 
positive I/S coefficient for the "super" model also shows that this static model 
performs worse than the more dynamic "airline" model for the seasonal component. In 
comparison with the standard X-11 on the 820 series, the net gain from an ARIMA model 
comes from better trend forecasts. This gain is usually larger than the loss from its 
poor seasonal forecasts that occurs when the volatility of a series increases. 
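
As a worked illustration, the fitted "super" model with all three measures in Table A5 can be evaluated for a given series. The sketch assumes that the four coefficients listed for that model correspond, in order, to a, b, c and d, and that natural logarithms were used; neither point is stated explicitly in the table. The function name and the input values are hypothetical.

```python
import math

# Coefficients of the three-measure "super" model as listed in Table A5
# (assumed order: intercept, log(STAR), log(I/C), I/S).
A, B, C, D = -12.175, -0.981, -7.595, 2.928


def estimated_gain_super(star, ic, i_s):
    """Estimated revision gain (RR - 100), assuming natural logarithms."""
    return A + B * math.log(star) + C * math.log(ic) + D * i_s


# Hypothetical series characteristics (not from the paper).
low_ic = estimated_gain_super(star=2.0, ic=1.0, i_s=3.0)
high_ic = estimated_gain_super(star=2.0, ic=3.0, i_s=3.0)
print(round(low_ic, 2), round(high_ic, 2))
```

A negative value corresponds to RR below 100, that is, smaller revisions than the standard X-11. Raising I/C lowers the estimate, while raising I/S raises it, which matches the signs discussed above.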


3. Logistic Regression Analysis 

Logistic regression analysis is often used to investigate the relationship between a 
discrete response variable and a set of explanatory variables (Collett, 1991). For the 
binary response variable Y created earlier (see definition on page 25), a logistic 
regression model with the 4 statistical explanatory measures allows us to test robustly 
which explanatory measures significantly affect the probability of the response 
variable. 


The results of the logistic regression models are reported in Table A6. They 
suggest that I/C and STAR significantly affect the probability of the binary response 
variable Y. 


Table A6: Logistic Regression of binary Y on four explanatory measures 

                 Intercept    Log(STAR)    I/S        Log(I/C) 
                 (|t|)        (|t|)        (|t|)      (|t|) 
Airline model    1.3046       0.2954       -0.1155    0.9142 
                 (2.59)*      (2.84)*      (1.18)     (6.53)* 
Super model      2.6411       0.4339       -0.6105    1.7503 
                 (5.18)*      (4.48)*      (5.98)*    (10.98)* 

* indicates statistics are significant at the 5% level. 


From the above logistic regression analyses, the positive coefficients of log(STAR) 
and log(I/C) indicate that the two ARIMA models are likely to have a higher 
probability of smaller revisions than the standard X-11 when I/C and STAR 
increase. The negative coefficients of I/S indicate that the two ARIMA models are 
likely to have a lower probability of smaller revisions than the standard X-11 
when I/S increases (although this is not significant for the "airline" model). These 
results are consistent with the linear regression (with transformation) analyses. 
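
The coefficients in Table A6 can be turned into a fitted probability with the usual logistic function. The sketch below uses the first row of the table and assumes that it belongs to the "airline" model (the row whose I/S coefficient is not significant), that the model is for the probability that Y is TRUE, and that natural logarithms were used. The function name and the input values are hypothetical.

```python
import math

# First coefficient row of Table A6 (assumed to be the "airline" model):
# intercept, log(STAR), I/S, log(I/C).
B0, B_STAR, B_IS, B_IC = 1.3046, 0.2954, -0.1155, 0.9142


def prob_smaller_revision(star, i_s, ic):
    """Fitted P(Y = TRUE), ie. P(RR < 100), under a logit link."""
    eta = B0 + B_STAR * math.log(star) + B_IS * i_s + B_IC * math.log(ic)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical series characteristics (not from the paper).
p_low_ic = prob_smaller_revision(star=2.0, i_s=3.0, ic=0.5)
p_high_ic = prob_smaller_revision(star=2.0, i_s=3.0, ic=3.0)
print(round(p_low_ic, 3), round(p_high_ic, 3))
```

Under these assumptions the fitted probability rises with STAR and I/C and falls with I/S, in line with the signs of the coefficients.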


